
Fix IndirectAttackEvaluator results not displaying in AI Foundry portal #45721

Draft
Copilot wants to merge 2 commits into main from copilot/fix-indirect-attack-evaluator-results

Conversation

Contributor

Copilot AI commented Mar 16, 2026

IndirectAttackEvaluator evaluations showed "Completed" in Foundry but no scores — results were only accessible programmatically. The bug: EvaluationMetrics.XPIA.value was changed to "indirect_attack" for service API communication, causing output columns to be named indirect_attack_label, indirect_attack_reason, etc. AI Foundry and the aggregation pipeline expect the xpia_ prefix established by the original schema.
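The prefix mismatch can be illustrated with a minimal sketch. The enum and the key-building helper below are hypothetical stand-ins for the SDK internals, not the actual code:

```python
from enum import Enum

# Hypothetical stand-in for the SDK's internal metrics enum: the
# service-facing value was changed to "indirect_attack" in v1.15.0.
class EvaluationMetrics(Enum):
    XPIA = "indirect_attack"

def output_key(metric_value: str, field: str) -> str:
    """Build an output column name by prefixing the field with the metric value."""
    return f"{metric_value}_{field}"

# Deriving the column name from the enum value produces the wrong prefix...
print(output_key(EvaluationMetrics.XPIA.value, "label"))  # indirect_attack_label

# ...while AI Foundry and the aggregation pipeline expect "xpia":
print(output_key("xpia", "label"))  # xpia_label
```

Because Foundry matches columns by the `xpia_` prefix, the `indirect_attack_*` columns were silently ignored even though the evaluation itself succeeded.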

Changes

  • _base_rai_svc_eval.py_parse_eval_result

    • Use "xpia" as the output key prefix for XPIA results regardless of EvaluationMetrics.XPIA.value ("indirect_attack")
    • Extract manipulated_content, intrusion, information_gathering as individual flat columns (xpia_manipulated_content, etc.) from scoreProperties, handling both snake_case and camelCase service response keys
    • Fallback: auto-rename any legacy indirect_attack_* keys to xpia_* for responses that already used the wrong prefix
  • _base_rai_svc_eval.py_evaluate_query_response

    • Pass metric_display_name="xpia" for XPIA so the legacy polling endpoint also produces correctly prefixed output keys via parse_response
  • _evaluate.py

    • _aggregate_label_defect_metrics: add "xpia" to handled_metrics so xpia_label columns are recognized and yield xpia_defect_rate aggregate metrics
    • _get_token_count_columns_to_exclude: add "xpia" to the known-metrics list so xpia_*_tokens columns are excluded from numeric aggregation
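The key normalization described in the `_parse_eval_result` bullets can be sketched roughly as follows. `normalize_xpia_keys` is a hypothetical simplification, not the SDK's actual helper:

```python
def normalize_xpia_keys(result: dict) -> dict:
    """Sketch of the normalization: rename legacy indirect_attack_* keys to
    xpia_* and flatten the three sub-metrics out of scoreProperties,
    accepting either snake_case or camelCase service response keys."""
    out = {}
    for key, value in result.items():
        if key.startswith("indirect_attack_"):
            key = "xpia_" + key[len("indirect_attack_"):]
        out[key] = value

    # Flatten sub-metrics into individual flat columns (xpia_manipulated_content, etc.).
    props = out.pop("scoreProperties", {}) or {}
    for snake, camel in [
        ("manipulated_content", "manipulatedContent"),
        ("intrusion", "intrusion"),
        ("information_gathering", "informationGathering"),
    ]:
        if snake in props:
            out[f"xpia_{snake}"] = props[snake]
        elif camel in props:
            out[f"xpia_{snake}"] = props[camel]
    return out

# Example: a response that used both the wrong prefix and camelCase keys.
raw = {
    "indirect_attack_label": False,
    "indirect_attack_reason": "No attack detected.",
    "scoreProperties": {"manipulatedContent": False, "intrusion": False},
}
print(normalize_xpia_keys(raw))
```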

Result: evaluate() with IndirectAttackEvaluator now produces the expected column names and metrics:

# Before fix:
result["rows"][0]["outputs.indirect_attack.indirect_attack_label"]  # wrong
result["metrics"]["indirect_attack.indirect_attack_defect_rate"]    # wrong

# After fix:
result["rows"][0]["outputs.indirect_attack.xpia_label"]                  # ✓
result["rows"][0]["outputs.indirect_attack.xpia_manipulated_content"]    # ✓
result["metrics"]["indirect_attack.xpia_defect_rate"]                    # ✓
result["metrics"]["indirect_attack.xpia_manipulated_content"]            # ✓
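The `xpia_defect_rate` aggregate produced by `_aggregate_label_defect_metrics` can be sketched as a mean over a boolean label column. This helper is a hypothetical simplification of the real aggregation:

```python
def label_defect_rate(rows: list, column: str):
    """Defect rate = fraction of rows whose boolean label is True,
    skipping rows where the label is missing."""
    labels = [row[column] for row in rows if row.get(column) is not None]
    if not labels:
        return None
    return sum(bool(v) for v in labels) / len(labels)

rows = [
    {"outputs.indirect_attack.xpia_label": True},
    {"outputs.indirect_attack.xpia_label": False},
    {"outputs.indirect_attack.xpia_label": False},
    {"outputs.indirect_attack.xpia_label": True},
]
print(label_defect_rate(rows, "outputs.indirect_attack.xpia_label"))  # 0.5
```

With the fix, the `xpia_label` column is recognized by the aggregation step, so this rate surfaces as `indirect_attack.xpia_defect_rate` in the metrics.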


Original prompt

This section details the original issue you should resolve

<issue_title>IndirectAttackEvaluator not uploading/displaying results in AI Foundry correctly</issue_title>
<issue_description>- Package Name: azure.ai.evaluation

  • Package Version: 1.15.3
  • Operating System: MacOS
  • Python Version: 3.12

Describe the bug
There appears to be a problem with IndirectAttackEvaluator. After data has been simulated with query/response pairs, and then passed/uploaded to AI Foundry, the evaluation results do not appear in the Foundry portal even though the results (returned back programmatically) prove that the evaluation ran correctly.

It's not clear whether the problem is with the SDK or Foundry. This is a blocker for all RAI evaluations that rely on indirect jailbreaking using the IndirectAttackEvaluator class.

To Reproduce

import os
from typing import Any, Dict, List, Optional

from azure.ai.evaluation import IndirectAttackEvaluator, evaluate
from azure.ai.evaluation.simulator import IndirectAttackSimulator
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI


azure_ai_project_endpoint = "<ai-foundry-project-endpoint>"
azure_endpoint = "<azure_endpoint>"
deployment = "gpt-5.1"
api_version = "2025-03-01-preview"

# sample application
def call_llm(
    query: str
) -> str:
    token_provider = get_bearer_token_provider(DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")
    client = AzureOpenAI(
        api_version = api_version,
        azure_endpoint = azure_endpoint,
        azure_ad_token_provider = token_provider,
    )
    result = client.responses.create(
        model = deployment,
        input = query,
    )
    return result.output_text

async def callback(
          messages: List[Dict],
          stream: bool = False,
          session_state: Any = None,
          context: Optional[dict[str, Any]] = None,
    ) -> dict:
    messages_list = messages["messages"]
    query = messages_list[-1]["content"]
    context = None

    # Send message to application and get a response
    try:
        response = call_llm(query)
    except Exception:
        response = None

    # Format response in OpenAI message protocol
    message = {"content": response, "role": "assistant", "context": context}
    messages["messages"].append(message)
    return {"messages": messages_list, "stream": stream, "session_state": session_state, "context": context}

# set up and run simulator
indirect_simulator = IndirectAttackSimulator(
    azure_ai_project = azure_ai_project_endpoint,
    credential = DefaultAzureCredential())

sim_results = await indirect_simulator(
    target=callback,
    max_conversation_turns=3,
    max_simulation_results=5,
)

# save simulated results to file
with open("indirect_jailbreak_example.jsonl", "w") as file:
    file.write(sim_results.to_eval_qr_json_lines())

# set up evaluator and evaluate the simulated jailbreak conversations
indirect_evaluator = IndirectAttackEvaluator(
    azure_ai_project = azure_ai_project_endpoint,
    credential = DefaultAzureCredential(),
)

eval_results = evaluate(
    evaluation_name = "example-indirect-jailbreak-evaluation",
    data = "indirect_jailbreak_example.jsonl",
    evaluators = {"indirect_attack": indirect_evaluator},
    azure_ai_project = azure_ai_project_endpoint,
)

Expected behavior
I expect to see the eval results/scores get reported and summarized correctly in Foundry. Currently no scores are recorded even though the object eval_results shows clear proof that the evaluator ran correctly.

After more testing, this class worked up through the v1.14.0 release. The problem began in the v1.15.0 release.

Screenshots

Image

Additional context
Add any other context about the problem here.
</issue_description>

Comments on the Issue (you are @copilot in this section)

@kashifkhan Thank you for the feedback and detailed repro. I'll forward this to the right team who can help you further.


…ry display

The IndirectAttackEvaluator was producing output keys with indirect_attack_ prefix
instead of the expected xpia_ prefix, causing results to not display in AI Foundry.

Root cause: EvaluationMetrics.XPIA.value was changed to indirect_attack for service
API calls, but output column names must use the xpia prefix for backward compatibility
with AI Foundry's expected column naming format.

Changes:
- _base_rai_svc_eval.py: Use xpia as output key prefix for XPIA results in
  _parse_eval_result, extract sub-metrics (manipulated_content, intrusion,
  information_gathering) as flat keys, add fallback to rename legacy
  indirect_attack_* keys to xpia_*
- _base_rai_svc_eval.py: Pass metric_display_name=xpia for legacy path
- _evaluate.py: Add xpia to handled_metrics in _aggregate_label_defect_metrics
  and to known metrics in _get_token_count_columns_to_exclude
- Add unit tests for XPIA-specific behavior
- Update CHANGELOG.md

Co-authored-by: needuv <40341266+needuv@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix IndirectAttackEvaluator results upload to AI Foundry Fix IndirectAttackEvaluator results not displaying in AI Foundry portal Mar 16, 2026
Copilot AI requested a review from needuv March 16, 2026 18:46
